Shared memory having multiple access configurations

ABSTRACT

An apparatus includes a first processor that accesses memory according to a first clock frequency, a second processor that accesses memory according to a second clock frequency, and a memory device configurable to selectively operate according to the first clock frequency or the second clock frequency. A memory controller enables dynamic configuration of organization of the memory device to allow a first portion of the memory device to be accessed by the first processor according to the first clock frequency and a second portion of the memory device to be accessed by the second processor according to the second clock frequency.

BACKGROUND

This description relates to shared memory having multiple access configurations.

A system on a chip (SoC) can have multiple embedded processors in which each processor may have unique memory access timing and data bus width requirements for accessing memory. In one implementation, two embedded processors each access a separate embedded memory module according to its native access timing scheme. In another implementation, two processors having different memory access timing schemes can access a shared memory device using a bridging process. For example, suppose a first processor is designed to access memory according to a first clock frequency, a second processor is designed to access memory according to a second clock frequency, and the shared memory module is configured to be accessed according to the first clock frequency. The first processor can access the memory according to its native memory access timing scheme. The second processor can access the memory module using a bridging process in which requests are converted from the second clock domain to the first clock domain that is compatible with the memory module, and responses from the memory module are converted from the first clock domain back to the second clock domain that is compatible with the second processor.

SUMMARY

In general, in one aspect, an apparatus includes a first processor that accesses memory according to a first clock frequency; a second processor that accesses memory according to a second clock frequency; a memory device configurable to selectively operate according to the first clock frequency or the second clock frequency; and a memory controller to enable dynamic configuration of organization of the memory device to allow a first portion of the memory device to be accessed by the first processor according to the first clock frequency and a second portion of the memory device to be accessed by the second processor according to the second clock frequency.

Implementations may include one or more of the following features. The memory controller may enable re-configuration of the organization of the memory device to adjust the sizes of the first and second portions while the first processor is executing an application program. The memory controller may enable re-configuration of the organization of the memory device to adjust the sizes of the first and second portions upon start-up of the memory device and the memory controller. The memory device may include a plurality of memory banks and the memory controller may allocate the first and second portions of the memory device along boundaries of memory banks. Multiplexers may be provided, in which each multiplexer is associated with a memory bank, and each multiplexer selects from a first clock signal having the first clock frequency and a second clock signal having the second clock frequency and passes the selected clock signal to the corresponding memory bank. The memory controller may enable dynamic configuration of the organization of the memory device to allow a third portion of the memory device to be accessed exclusively by the first processor according to the first clock frequency.

The memory controller may enable re-configuration of the memory device to adjust the sizes of the first and second portions, and the memory controller may include an address decoder to receive memory access requests from the first processor and determine whether the memory access requests are for accessing the first or third portion of the memory device. The memory controller may reconfigure the organization of the memory device to re-allocate the sizes of the first and second portions of the memory device using five or fewer clock cycles according to the slower of the first and second clock frequencies. The first portion of the memory device may be accessed using a first bus width, and the second portion of the memory device may be accessed using a second bus width. The memory controller may monitor execution of memory access instructions, and upon receiving a signal to switch from a first memory access clock frequency to a second memory access clock frequency for accessing a segment of the memory device, determine whether a previous memory access instruction using the first memory access clock frequency for the segment of the memory device has been completed before switching to the second memory access clock frequency.

The memory controller can include an arbitration unit to arbitrate accesses to shared memory banks of the memory by the first and second processors. The memory controller can include a first clock domain requester that operates according to the first clock frequency and passes memory access instructions from the first processor to the shared memory banks, and a second clock domain requester that operates according to the second clock frequency and passes memory access instructions from the second processor to the shared memory banks. Only one of the first and second clock domain requesters is granted access to any particular shared memory bank at a given time. The first clock domain requester can be granted access to a particular shared memory bank until the second clock domain requester requests access to the particular shared memory bank, upon which the first and second clock domain requesters can perform an arbitration process to determine whether the second clock domain requester can be granted access to the particular shared memory bank. The second clock domain requester can send an arbitration request signal to the first clock domain requester, and the first clock domain requester can send an arbitration grant signal to the second clock domain requester after the first clock domain requester determines that memory access requests from the first processor to the particular shared memory bank have been completed. The memory controller can include one or more synchronization units that synchronize the arbitration request signal and arbitration grant signal across different clock domains. The memory device may be configurable to selectively operate according to three or more clock frequencies, and the memory controller may enable dynamic configuration of the memory device to allow a portion of the memory device to be accessed according to any of the clock frequencies in which the memory device is operable.

In general, in another aspect, an apparatus includes a first processor that accesses memory according to a first timing scheme; a second processor that accesses memory according to a second timing scheme; a memory device having shared memory banks that can be accessed by either the first or the second processor, the shared memory banks configurable to selectively operate according to the first timing scheme or the second timing scheme; and a memory controller to enable dynamic configuration of organization of the shared memory banks to allow a first set of shared memory banks to be accessed by the first processor according to the first timing scheme and a second set of the shared memory banks to be accessed by the second processor according to the second timing scheme.

Implementations can include one or more of the following features. The first processor can access the shared memory banks according to a first clock frequency, and the second processor can access the shared memory banks according to a second clock frequency. The memory controller can include a first clock domain requester that operates according to the first clock frequency and passes memory access instructions from the first processor to the shared memory banks, and a second clock domain requester that operates according to the second clock frequency and passes memory access instructions from the second processor to the shared memory banks. Only one of the first and second clock domain requesters is granted access to any particular shared memory bank at a given time. When the first processor is granted access to a particular shared memory bank, and the second processor seeks access to the particular shared memory bank, the second clock domain requester sends an arbitration request signal to the first clock domain requester, and the first clock domain requester sends an arbitration grant signal to the second clock domain requester after the first clock domain requester determines that memory access requests from the first processor to the particular shared memory bank have been completed. The memory controller can include one or more synchronization units that synchronize the arbitration request signal and arbitration grant signal across different clock domains.

In general, in another aspect, a method includes dynamically configuring organization of a memory device to allow a first portion of the memory device to be accessed by a first processor according to a first timing scheme and a second portion of the memory device to be accessed by a second processor according to a second timing scheme. A first memory access instruction is received from the first processor, and the first portion of the memory device is accessed responsive to the first memory access instruction according to the first timing scheme. A second memory access instruction is received from the second processor, and the second portion of the memory device is accessed responsive to the second memory access instruction according to the second timing scheme.

Implementations may include one or more of the following features. Accessing the first portion of the memory device according to the first timing scheme may include accessing the first portion of the memory device according to a first clock frequency, and accessing the second portion of the memory device according to the second timing scheme may include accessing the second portion of the memory device according to a second clock frequency. The method may include, for each memory bank in the memory device, selecting one of a first clock signal having the first clock frequency and a second clock signal having the second clock frequency, and passing the selected clock signal to the memory bank. The method may include reconfiguring the organization of the memory device to adjust the sizes of the first and second portions while executing an application program by the first processor. The method may include reconfiguring the organization of the memory device to adjust the sizes of the first and second portions upon start-up of the memory device and the memory controller. The method may include allocating the first and second portions of the memory device along boundaries of memory banks of the memory device. The method may include reconfiguring the memory device to re-allocate the sizes of the first and second portions of the memory device using five or fewer clock cycles according to the slower of the first and second clock frequencies. The method may include accessing the first portion of the memory device using a first bus width, and accessing the second portion of the memory device using a second bus width. The memory controller may monitor execution of memory access instructions, and upon receiving a signal to switch from a first memory access timing scheme to a second memory access timing scheme, determine whether memory access instructions associated with the first memory access timing scheme have been completed before switching to the second memory access timing scheme.

In general, in another aspect, a method includes dynamically configuring organization of a memory device having shared memory banks that are shared between a first processor and a second processor to allow a first set of the shared memory banks to be accessed by the first processor according to a first clock frequency and a second set of the shared memory banks to be accessed by the second processor according to a second clock frequency; and arbitrating requests for accessing the shared memory banks by the first and second processors, the requests from the first processor being synchronized according to the first clock frequency and the requests from the second processor being synchronized according to the second clock frequency.

Implementations can include one or more of the following features. The method can include granting access to a particular shared memory bank to the first processor until the second processor requests access to the particular shared memory bank, and performing an arbitration handshake to determine whether the second processor is granted access to the particular shared memory bank. Performing the arbitration handshake can include sending an arbitration request signal from a second clock domain requester to a first clock domain requester, and sending an arbitration grant signal from the first clock domain requester to the second clock domain requester after the first clock domain requester determines that memory access requests from the first processor to the particular shared memory bank have been completed. The method can include granting access to the particular shared memory bank to the second processor until the first processor requests access to the particular shared memory bank, sending an arbitration request signal from the first clock domain requester to the second clock domain requester, and sending an arbitration grant signal from the second clock domain requester to the first clock domain requester after the second clock domain requester determines that memory access requests from the second processor to the particular shared memory bank have been completed. The method can include synchronizing the arbitration request signal and the arbitration grant signal across different clock domains.

In general, in another aspect, an apparatus includes a memory device having a plurality of portions each configurable to operate according to multiple timing schemes; and means for enabling dynamic configuration of organization of the memory device to allow each portion of the memory device to be dynamically configured to be accessed according to a first timing scheme or a second timing scheme while executing an application program.

These and other aspects and features, and combinations of them, may be expressed as methods, apparatus, systems, means for performing functions, program products, and in other ways.

Advantages of the aspects, systems, and methods may include one or more of the following. The latency of memory access for each of the masters using the memory can be lowered because there is no need to convert memory requests from one clock domain to another. A single memory device can be shared by multiple masters, so the number of memory devices can be reduced, and the cost of the overall system can be reduced.

DESCRIPTION OF DRAWINGS

FIGS. 1 to 3 are schematic diagrams of an example system having configurable memory architecture.

FIGS. 4 and 5 show various examples of memory access configurations.

FIG. 6A is a block diagram of an example memory controller.

FIG. 6B is a diagram of a clock domain crossing and bank arbitration unit.

FIGS. 7-10, 11A, 11B, 12, 13A, 13B, 14, 15A, 15B, and 16 are example timing diagrams.

FIGS. 17 and 18 show various examples of memory access configurations.

DETAILED DESCRIPTION

Referring to FIG. 1, an example system 100 includes a first data processor 102 that accesses memory according to a first access timing scheme and data bus width, and a second data processor 104 that accesses memory according to a second access timing scheme and data bus width. A memory controller 106 dynamically configures the organization of a memory device 108 such that a first portion 110 of the memory device 108 can be accessed by the first data processor 102 according to its memory access requirements, and a second portion 112 of the memory device 108 can be accessed by the second data processor 104 according to its memory access requirements. The configuration of the first and second portions can be set during start-up of the system 100 or adjusted dynamically while application programs are executed by the first and second processors.

In some examples, the system 100 can be a system-on-a-chip (SoC) in which the first processor 102 is an embedded general purpose microprocessor (MCU), the second processor 104 is an embedded digital signal processor (DSP), and the memory device 108 is an embedded memory module. The first and second processors have different access timing and data width requirements. For example, the first processor 102 uses the first portion 110 of the memory as a system memory (or L3 memory), and accesses (either read or write) the first portion 110 of the memory device 108 through a first interface bus 130 having a first bus width. The second processor 104 uses the second portion 112 of the memory device 108 as a level-2 (L2) cache memory, and accesses (either read or write) the second portion 112 of the memory device 108 through a second bus 132 having a second bus width. In some examples, the first bus 130 is 32 bits wide, and the second bus 132 is 64 bits wide. In this case, access to the L2 memory 112 through the 64-bit wide bus 132 is faster than access to the L3 memory 110 through the 32-bit wide bus 130.

In some implementations, the memory device 108 has a 32-bit wide read/write port 114 for accessing the L3 memory 110 and a 64-bit wide read/write port 116 for accessing the L2 memory 112. The L3 memory 110 receives a first clock signal 118, which has a frequency that corresponds to the clock frequency that the first processor 102 uses to access the memory. The L2 memory 112 receives a second clock signal 120, which has a second frequency that corresponds to the clock frequency that the second processor 104 uses to access the memory. This allows parallel access to the L2 and L3 memory portions, in which each of the L2 and L3 memory portions is accessed according to its access timing scheme and bus width.

The example in FIG. 1 shows the memory device 108 being organized into two portions 110 and 112. The memory device 108 can also be organized into three or more portions that can be accessed using three or more access timing schemes and bus widths (assuming there are additional clock signals and data buses). Such a shared memory device may have interfaces to L1/L2/L3 clock domains, L2/L3/L4 clock domains, L1/L2/L3/L4 clock domains, and so on. The clock signals of different domains can be either synchronous or asynchronous with respect to each other.

Referring to FIG. 2, it is possible to allocate a portion 110 of the memory 108 as L3 memory to be used exclusively by the first processor 102, a portion 112 of the memory 108 as L2 memory to be used exclusively by the second processor 104, and a portion 290 of the memory 108 as shared L2-L3 memory to be used by both the first and second processors. A multiplexer 128 is used to multiplex the first clock signal CLK 1 (118) and the second clock signal CLK 2 (120), and depending on which processor is accessing the shared memory portion 290, send a selected clock signal to the shared memory portion 290. In this example, depending on application, the memory controller 106 can dynamically adjust the ratio of L2 to L3 memory in the shared portion 290, but does not change the amount of L3 and L2 memory reserved for exclusive use by the first and second processors, respectively.

FIG. 3 is a schematic diagram of the system 100 illustrating an example architecture for enabling the memory device 108 to be re-configured dynamically. The memory device 108 can include several memory banks, such as 114 a, 114 b, 114 c, etc., collectively referenced as 114. A memory controller 106 is configured to enable each memory bank 114 to be selectively accessed through the first data bus 130 or the second data bus 132. Each data bus (e.g., 126 a) between the memory bank 114 and the memory controller 106 is sufficiently wide to support the maximum data-width operation. For example, each data bus can be 64 bits wide to allow either 32-bit or 64-bit operations.

In some implementations, the separation of the first and second portions is along a boundary of the memory banks so that L2 and L3 memory operations can generally be concurrent to increase system efficiency. Each memory bank receives a clock signal from a multiplexer (e.g., 128 a), which multiplexes a first clock signal (CLK 1) 118 and a second clock signal (CLK 2) 120. In some examples, the first clock signal has a frequency of 150 MHz and the second clock signal has a frequency of 75 MHz. The first clock signal CLK 1 (118) and the second clock signal CLK 2 (120) can be either synchronous or asynchronous.
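By way of illustration only, the per-bank clock selection described above can be modeled in software as follows. This is a minimal sketch rather than a description of any particular circuit: the class name BankClockMux, the select_clk2 flag, and the function bank_clocks are hypothetical names introduced here, and only the 150 MHz and 75 MHz example frequencies come from the description.

```python
from dataclasses import dataclass

# Example frequencies taken from the description (CLK 1 and CLK 2).
CLK1_HZ = 150_000_000
CLK2_HZ = 75_000_000


@dataclass
class BankClockMux:
    """Hypothetical model of one per-bank multiplexer (e.g., 128 a)."""

    select_clk2: bool = False  # False passes CLK 1, True passes CLK 2

    def output_hz(self) -> int:
        """Return the frequency of the clock passed to the bank."""
        return CLK2_HZ if self.select_clk2 else CLK1_HZ


def bank_clocks(select_bits):
    """Map one select bit per bank to the clock frequency each bank receives."""
    return [BankClockMux(bool(bit)).output_hz() for bit in select_bits]


if __name__ == "__main__":
    # Four banks clocked by CLK 1 and two banks clocked by CLK 2.
    print(bank_clocks([0, 0, 0, 0, 1, 1]))
```

Because each bank has its own select bit, the division between the two portions falls on a bank boundary, consistent with the allocation described above.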

Other than clock frequencies, the L2 and L3 memory can have other accessing requirements that are different. For example, the L3 memory may have a late write requirement (where write data appears a half or whole clock cycle after the address), while the L2 memory does not. Because each bank of memory can be configured to be L2 or L3 memory, each bank of memory is configured to be able to support late write if this function is selected.

In some implementations, the memory controller 106 can configure the memory device 108 at start-up to have a user-specified memory configuration. For example, suppose the memory device 108 has 8 Mb of total memory. A first user may set 2 Mb of the total memory to be used as L2 memory and 6 Mb of the total memory as L3 memory, and a second user may set 7 Mb of the total memory to be used as L2 memory and 1 Mb of the total memory as L3 memory. For example, the system 100 can be booted from a FLASH (NAND or NOR) device. After start-up, the user-specified memory configurations (including memory controller configurations and clock configurations) are written to control registers as part of the boot process.
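As a sketch of how a user-specified split might be encoded for a control register, the following assumes a hypothetical layout with one select bit per bank, 1 Mb banks, L3 memory in the low-numbered banks, and L2 memory in the high-numbered banks. None of these layout details are specified in the description; they are assumptions made only for the example, as is the function name bank_select_word.

```python
def bank_select_word(total_mb: int, l2_mb: int, bank_mb: int = 1) -> int:
    """Build a hypothetical control-register word with one bit per bank.

    A set bit marks a bank as L2 memory (clocked by CLK 2); a clear bit
    marks it as L3 memory (clocked by CLK 1). The bit layout is assumed.
    """
    if total_mb % bank_mb or l2_mb % bank_mb:
        raise ValueError("sizes must fall on bank boundaries")
    if not 0 <= l2_mb <= total_mb:
        raise ValueError("L2 size out of range")
    total_banks = total_mb // bank_mb
    l2_banks = l2_mb // bank_mb
    word = 0
    # Assumed ordering: L3 in the low-numbered banks, L2 in the high ones.
    for bank in range(total_banks - l2_banks, total_banks):
        word |= 1 << bank
    return word


if __name__ == "__main__":
    # First user: 2 Mb of L2 and 6 Mb of L3 out of 8 Mb.
    print(f"{bank_select_word(8, 2):#010b}")
    # Second user: 7 Mb of L2 and 1 Mb of L3 out of 8 Mb.
    print(f"{bank_select_word(8, 7):#010b}")
```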

For example, the L2 port 116 and L3 port 114 can access the shared memory device 108 in parallel as long as the accesses are not to the same bank. Arbitration outside the shared memory device 108 can handle the priority and conflicts posed by the sharing feature. For example, when single port memory implementations are used for each memory bank, overlapping L2 and L3 accesses to the same memory bank 114 are not allowed.

In some implementations, the memory controller 106 enables re-configuration of the organization of the memory usage within active cycle times. This means that when a given application program starts working and needs more memory, the application program can re-configure the memory allocation on-the-fly, or with only a few cycles taken to switch or re-configure the memory allocation. For example, the memory architecture can be designed such that the first portion of the memory is activated with one clock edge and the second portion of the memory is activated with a different clock edge.

By supporting different data widths and access times, the system 100 allows for application specific optimization of L2 and L3 memory resources and improves die area (cost). The shared memory device 108 can have multiple read and write ports of different data widths having different access timings per port. Each application can be optimized independently of the fixed memory size in hardware by trading off L2 memory usage versus L3 memory usage. In addition, the user can dynamically change the L2 versus L3 memory allocation as desired without incurring a configuration step or associated delay (except for a delay of a few cycles when switching from one clock domain to another).

Given that only a single access can occur to a memory bank at a time, concurrent accesses by the L2 and L3 interfaces can only occur to different memory banks. To handle a potential L2 and L3 access conflict at the same bank, arbitration is used to ensure that no concurrent L2 and L3 accesses occur to the same bank.
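The bank-conflict rule above can be sketched as a small decision function. The fixed-priority tie-break used here is an assumption made for the example; the description does not state which interface wins a conflict, and the function name grant_ports is hypothetical.

```python
def grant_ports(l3_bank, l2_bank, priority="L3"):
    """Return the ports granted in one arbitration round.

    l3_bank and l2_bank are the bank indices requested by the L3 and L2
    interfaces, or None when an interface has no request. When both
    interfaces target the same bank, only the `priority` port is granted
    (an assumed tie-break policy).
    """
    if priority not in ("L2", "L3"):
        raise ValueError("priority must be 'L2' or 'L3'")
    requests = {"L3": l3_bank, "L2": l2_bank}
    active = {port: bank for port, bank in requests.items() if bank is not None}
    if len(active) == 2 and active["L3"] == active["L2"]:
        return [priority]  # conflict: a single access per bank at a time
    return sorted(active)  # different banks (or one request): all proceed


if __name__ == "__main__":
    print(grant_ports(0, 1))  # different banks, both proceed
    print(grant_ports(3, 3))  # same bank, only one proceeds
```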

The shared L2-L3 memory further improves system performance by eliminating the need for L3 to L2 memory transfers and vice-versa (which may be necessary if one of the processors 102 and 104 can only access the L2 or L3 memory). This means that the shared L2-L3 memory allows either the first processor 102 or second processor 104 to operate on the same data without requiring memory transfers, thereby improving system efficiency. In the system 100, the latency of memory access for the processors 102 and 104 can be lowered, as compared to a conventional system that uses a bridging process to convert memory requests from one clock domain to another. Using a single memory module instead of two memory modules reduces the area on the semiconductor chip, reducing the cost of the system 100.

These multiple memory banks may have the same or different native data widths and bank access times. The memory banks 114 can include native 32-bit banks, native 64-bit banks, or a combination of them. Factors such as access time, area, aspect ratio, power consumption, and potential L2 versus L3 usage are taken into account when designing the memory banks.

The system 100 allows dynamic configuration of the memory organization during execution of an application program. For example, referring to FIG. 4, suppose there is a total of 6 Mb of shared memory, and each memory bank corresponds to 1 Mb. An application program may determine that it needs 4 Mb of L3 memory and 2 Mb of L2 memory, and sends instructions to the memory controller 106 to reconfigure the organization of the shared memory. As shown in FIG. 4, memory banks 114 a to 114 d are configured as part of the L3 memory 110, each bank receives the first clock signal CLK 1 (118) and is accessed through the first data bus 130 using an access timing scheme associated with the L3 memory 110. Memory banks 114 e and 114 f are configured as part of the L2 memory 112, each bank receives the second clock signal CLK 2 (120) and is accessed through the second data bus 132 using an access timing scheme associated with the L2 memory 112.

During execution, the application program may determine that it needs more L2 memory 112. For example, referring to FIG. 5, the application program may request an increase of L2 memory 112 to 4 Mb and a decrease of L3 memory 110 to 2 Mb, and sends instructions to the memory controller 106 to reconfigure the organization of the shared memory accordingly. After a few clock cycles (e.g., five clock cycles or fewer of the slower clock) of reconfiguring the memory device 108, memory banks 114 a and 114 b are configured as part of the L3 memory 110, each bank receives the first clock signal 118 and is accessed through the first data bus 130 using an access timing scheme associated with the L3 memory 110. Memory banks 114 c to 114 f are configured as part of the L2 memory 112, in which each bank receives the second clock signal 120 and is accessed through the second data bus 132 using an access timing scheme associated with the L2 memory 112.
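The change from the FIG. 4 organization to the FIG. 5 organization can be illustrated with a short sketch that lists which banks must switch clock domains. The helper names layout and banks_to_switch are hypothetical; the bank labels and the 4 Mb/2 Mb and 2 Mb/4 Mb splits follow the example above, assuming 1 Mb per bank.

```python
# Bank labels follow the reference numerals 114 a to 114 f, 1 Mb each.
BANKS = ["114a", "114b", "114c", "114d", "114e", "114f"]


def layout(l3_banks: int):
    """Assign the first `l3_banks` banks to L3 and the remainder to L2."""
    if not 0 <= l3_banks <= len(BANKS):
        raise ValueError("L3 bank count out of range")
    return {name: ("L3" if i < l3_banks else "L2") for i, name in enumerate(BANKS)}


def banks_to_switch(before, after):
    """Return the banks whose role (and hence clock signal) changes."""
    return [bank for bank in BANKS if before[bank] != after[bank]]


if __name__ == "__main__":
    fig4 = layout(4)  # 4 Mb of L3 memory, 2 Mb of L2 memory
    fig5 = layout(2)  # 2 Mb of L3 memory, 4 Mb of L2 memory
    print(banks_to_switch(fig4, fig5))
```

In this example only banks 114 c and 114 d change from the first clock signal to the second; the other four banks keep their existing configuration.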

The amount of time required for reconfiguring the ratio of L2 to L3 memory can vary depending on the memory structure and the configuration of the memory controller 106. In some implementations, reconfiguring the ratio of L2 to L3 memory may take more than five clock cycles to complete. In some implementations, reconfiguring the ratio of L2 to L3 memory may require three cycles or less of the slower processor's clock.

In the example of FIGS. 4 and 5, the memory banks 1 to 6 can be accessed by either the first processor 102 or the second processor 104. There are several types of memory devices that can be used to implement the shared memory banks 1 to 6. For example, some memory devices are optimized for 32-bit native accesses (herein referred to as 32-bit memory devices), and some memory devices are optimized for 64-bit native accesses (herein referred to as 64-bit memory devices). The shared memory banks 1 to 6 can be implemented by (a) using only 32-bit memory devices, (b) using only 64-bit memory devices, or (c) using a combination of 32-bit and 64-bit memory devices.

When the combination implementation is chosen, power consumption is lowest when the first processor 102 accesses the 32-bit memory devices and the second processor 104 accesses the 64-bit memory devices. The portion of the memory banks implemented using 32-bit memory devices and the portion of the memory banks implemented using 64-bit memory devices can be determined based on predicted usage of the first and second processors. For example, if it is predicted that, for most applications, the first processor 102 will likely use 4 Mb of L3 memory and the second processor 104 will likely use 2 Mb of L2 memory, then memory banks 1 to 4 can be implemented using 32-bit memory devices and memory banks 5 and 6 can be implemented using 64-bit memory devices. It should be noted that this scheme is not limited to 32-bit and 64-bit memory devices. Memory devices with other native bus widths, such as 128-bit and 64-bit, 128-bit and 32-bit, 32-bit and 16-bit, 64-bit and 16-bit, and other sizes not necessarily based on a power of 2 (e.g., 48-bit and 24-bit), can also be adopted.

In the example above, during operation of the system 100, optimal power consumption usage can be achieved if the first processor 102 accesses memory banks 1 to 4 and the second processor 104 accesses memory banks 5 and 6 (as shown in FIG. 4). It is possible to reconfigure the memory usage to allow the first processor 102 to access memory banks 1 and 2 and allow the second processor 104 to access memory banks 3 to 6 (as shown in FIG. 5), but using the second processor 104 to access memory banks 3 and 4 may increase power consumption because memory banks 3 and 4 are not optimized for the 64-bit memory accesses used by the second processor 104.
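The trade-off above can be checked mechanically by comparing each bank's native width with the access width of the processor it is assigned to. The following sketch uses the bank widths from the example (banks 1 to 4 native 32-bit, banks 5 and 6 native 64-bit); the labels P1 and P2 and the helper names are hypothetical, and the sketch only flags mismatches without estimating any actual power figure.

```python
# Native widths from the example: banks 1 to 4 are 32-bit, banks 5 and 6 are 64-bit.
NATIVE_WIDTH = {1: 32, 2: 32, 3: 32, 4: 32, 5: 64, 6: 64}
# Access widths: first processor 102 uses 32-bit, second processor 104 uses 64-bit.
ACCESS_WIDTH = {"P1": 32, "P2": 64}


def assignment(p1_banks: int):
    """Banks 1..p1_banks go to the first processor, the rest to the second."""
    return {bank: ("P1" if bank <= p1_banks else "P2") for bank in NATIVE_WIDTH}


def mismatched_banks(assign):
    """Return banks whose native width differs from their processor's width."""
    return [
        bank
        for bank, proc in sorted(assign.items())
        if NATIVE_WIDTH[bank] != ACCESS_WIDTH[proc]
    ]


if __name__ == "__main__":
    print(mismatched_banks(assignment(4)))  # organization of FIG. 4
    print(mismatched_banks(assignment(2)))  # organization of FIG. 5
```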

An advantage of the system 100 is that a person designing the system 100 does not need to understand the implementation details of the memory device 108 in order to design the system to allow the first processor 102 to access portions of the memory 108 according to the first clock frequency and to allow the second processor 104 to access portions of the memory 108 according to the second clock frequency. Another advantage of the system 100 is that, because the shared memory device is available to both processors and clock domains, there is no need to transfer memory contents between two interfaced clock domains.

FIG. 6A is a diagram of an example memory controller 106. The memory controller 106 includes a control register interface 300 that receives register control data from a system controller (not shown) via a register bus 301. The register control data can be used to configure the memory controller 106 and the memory banks, such as the sizes of the L2 and L3 memory portions. The register control data can be retrieved from a FLASH memory during start-up. The control register interface 300 sends information to a first address decoder 304 and a second address decoder 306 to specify which memory addresses are associated with the L3 memory banks 110 that are used exclusively by the first processor 102, L2 memory banks 112 that are used exclusively by the second processor 104, and shared memory banks 290 that can be accessed by both processors 102 and 104. In this example, data on the register bus 301 is synchronized to the first clock signal CLK 1 (118).

A memory access request (e.g., read or write request) on the first data bus 130 is received by the first address decoder 304, which determines whether the memory access request is for the reserved L3 memory banks 110 or shared memory banks 290. If the memory request is for the reserved L3 memory banks 110, the memory controller 106 accesses the L3 memory banks 110 according to the memory request. Both the first data bus 130 and the L3 memory banks 110 are synchronized to the first clock signal CLK 1 (118).

If the memory access request on the first data bus 130 is for the shared memory banks 290, the request is sent to a clock domain crossing and bank arbitration unit 302 through a bus 322. The shared memory banks 290 have a first portion configured as L3 memory and a second portion configured as L2 memory. Depending on whether the memory access request is for the first or second portion, the memory controller 106 accesses the portion of the shared memory banks 290 using the appropriate clock frequency and timing scheme. The clock domain crossing and bank arbitration unit 302 also arbitrates requests that access the same memory bank to prevent conflicts.

A memory access request on the second data bus 132 is received by the second address decoder 306, which determines whether the memory access request is for the reserved L2 memory banks 112 or the shared memory banks 290. If the memory request is for the reserved L2 memory banks 112, the memory controller 106 accesses the L2 memory banks 112 according to the memory request. Both the second data bus 132 and the L2 memory banks 112 are synchronized to the second clock signal CLK2 (120).

If the memory access request on the second data bus 132 is for the shared memory banks 290, the request is sent to the clock domain crossing and bank arbitration unit 302 through the bus 324. Depending on whether the memory access request is for the first portion of the shared memory banks 290 (which is configured as L3 memory) or the second portion of the shared memory banks 290 (which is configured as L2 memory), the memory controller 106 accesses the portion of the shared memory banks 290 using the appropriate clock frequency and timing scheme.

In some implementations, the configurations of the memory controller 106 and the memory device 108 can be changed by writing register data through the control register interface 300. For example, during execution of an application program, the application program may cause the system controller to write to registers in the memory controller 106 to change the allocation of the memory banks to the L2 and L3 portions.

In some examples, the memory controller 106 monitors execution of memory access instructions and determines when it is appropriate for a memory bank to switch between L2 and L3 configurations. When the memory controller 106 receives control register data via the control register interface 300 indicating that a memory bank is to switch from a first clock frequency to a second clock frequency, the memory controller 106 determines whether a previous memory access instruction for accessing the memory bank (using the first clock frequency) has been completed before switching to the second clock frequency.
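The completion check described above can be illustrated with a small behavioral model. This is a sketch under stated assumptions rather than the controller's actual logic: the class `BankClockSwitch` and its method names are hypothetical, and the choice to refuse new accesses while a switch is pending is an assumption made here to keep the model simple.

```python
class BankClockSwitch:
    """Toy model of one memory bank whose clock selection may be changed."""

    def __init__(self, clock="CLK1"):
        self.clock = clock      # clock currently applied to the bank
        self.pending = 0        # accesses issued under the current clock, not yet done
        self.requested = None   # clock requested via control register data, if any

    def issue_access(self):
        # Assumption: accesses are held off once a switch has been requested.
        if self.requested is not None:
            raise RuntimeError("bank is switching clocks; access must wait")
        self.pending += 1

    def complete_access(self):
        if self.pending == 0:
            raise RuntimeError("no access is outstanding")
        self.pending -= 1
        self._try_switch()

    def request_switch(self, new_clock):
        """Record a switch request; it takes effect only once the bank is idle."""
        self.requested = new_clock
        self._try_switch()

    def _try_switch(self):
        # Switch only after every previous access under the old clock has completed.
        if self.requested is not None and self.pending == 0:
            self.clock = self.requested
            self.requested = None
```

For example, if one access is outstanding when `request_switch("CLK2")` is called, the bank stays on `CLK1` until `complete_access()` is called, and only then moves to `CLK2`.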

FIG. 6B is a diagram of the clock domain crossing and bank arbitration unit 302, which uses a handshaking process between a first clock domain requester 310 and a second clock domain requester 312 to enable arbitration of memory access requests sent from the first processor 102 and the second processor 104. The first clock domain requester 310 is a logic circuit that is synchronized to the first clock signal CLK1 (118), and receives read and write memory access instructions from the first processor 102 through the bus 322. The second clock domain requester 312 is a logic circuit that is synchronized to the second clock signal CLK2 (120), and receives read and write memory access instructions from the second processor 104 through the bus 324.

The clock domain crossing and bank arbitration unit 302 allows either the first processor 102 (which operates in the first clock domain) or the second processor 104 (which operates in the second clock domain) to access the shared memory banks 290 through a shared memory bank bus 314. The clock domain currently granted the shared memory bank bus 314 retains control of the bus 314 until a synchronized arbitration request from the other clock domain is granted by the controlling clock domain. Both the arbitration request and the arbitration grant are synchronized. One clock domain requester, either clock domain requester 310 or 312, may then output the arbitration result to an arbitration multiplexer 316, which sends a selected memory request to the designated shared memory bank 290.

For example, referring to FIGS. 17 and 18, suppose the shared memory banks 290 include the memory banks 1 to 6 of the memory 108. The following describes a process in which the memory controller 106 is first configured to access the memory 108 as shown in FIG. 17, in which the first processor 102 accesses memory banks 1 to 4, and the second processor 104 accesses the memory banks 5 and 6, then re-configured to access the memory 108 as shown in FIG. 18, in which the first processor 102 accesses the memory banks 1 and 2, and the second processor 104 accesses the memory banks 3 to 6. The memory banks in the shared memory 290 that are accessed by the first processor 102 are referred to as L3 memory banks in the shared memory 290, and the memory banks in the shared memory 290 that are accessed by the second processor 104 are referred to as L2 memory banks in the shared memory 290.
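The change from the FIG. 17 allocation to the FIG. 18 allocation can be written out as a small table comparison. The dictionaries and the helper `banks_changing_owner` below are illustrative names introduced here; the labels "P102" and "P104" simply stand for the first processor 102 and the second processor 104.

```python
# Bank-to-processor allocation for shared banks 1 to 6, as described for each figure.
FIG17 = {1: "P102", 2: "P102", 3: "P102", 4: "P102", 5: "P104", 6: "P104"}
FIG18 = {1: "P102", 2: "P102", 3: "P104", 4: "P104", 5: "P104", 6: "P104"}


def banks_changing_owner(old, new):
    """Return {bank: (old owner, new owner)} for banks whose owner differs.

    Assumes both allocations cover the same set of banks.
    """
    return {bank: (old[bank], new[bank])
            for bank in sorted(old)
            if old[bank] != new[bank]}


changed = banks_changing_owner(FIG17, FIG18)
```

Here `changed` contains only banks 3 and 4, which move from the first processor to the second. These are the banks for which an arbitration handshake is needed during the re-configuration.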

An arbitration handshake process between processors 102 and 104 is handled by the first clock domain requester 310 and the second clock domain requester 312 in FIG. 6B to ensure that all memory accesses to the memory banks 3 and 4 in the first clock domain (CLK1) have been completed before the memory banks 3 and 4 are switched from being accessed by the first processor 102 to being accessed by the second processor 104. The second clock domain requester 312 requests access to the memory banks 3 and 4 by sending an arbitration request 318 to the first clock domain requester 310. A synchronization unit 320a receives the request 318 from the second clock domain requester 312 and synchronizes the request 318 with the first clock signal CLK1 (118) before sending the request 318 to the first clock domain requester 310. The first clock domain requester 310 determines whether there are pending memory accesses to the memory banks 3 and 4. If there are pending memory accesses, the first clock domain requester 310 waits until the memory accesses are completed, then sends an arbitration granted signal 322 to the second clock domain requester 312. The arbitration granted signal 322 is synchronized by a synchronization unit 320b. Afterwards, the memory banks 3 and 4 can be accessed by the second processor 104, as shown in FIG. 18.
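The request, synchronize, wait, and grant sequence described above can be modeled with a short cycle-based simulation. This is a rough sketch, not the circuit of FIG. 6B. It assumes that each synchronization unit behaves as a fixed delay of `stages` ticks of the destination clock, that time advances in CLK2 ticks with CLK1 ticking once every `clk1_period` steps, and that one pending access completes per CLK1 tick. The names `Synchronizer` and `run_handshake` are hypothetical.

```python
from collections import deque


class Synchronizer:
    """Models a synchronization unit (such as 320a or 320b) as a fixed delay."""

    def __init__(self, stages=2):
        self.pipe = deque([False] * stages, maxlen=stages)

    def tick(self, value):
        """Advance one destination-clock tick; return the value sampled `stages` ticks ago."""
        out = self.pipe[0]
        self.pipe.append(value)  # the oldest entry is dropped automatically
        return out


def run_handshake(pending_clk1_accesses, clk1_period=2, stages=2, max_steps=1000):
    """Return the step at which the second-domain requester observes the grant."""
    req_sync = Synchronizer(stages)    # request 318 crossing into the CLK1 domain
    grant_sync = Synchronizer(stages)  # grant crossing into the CLK2 domain
    request = True                     # requester 312 raises the request at step 0
    grant = False                      # driven by requester 310
    pending = pending_clk1_accesses
    for step in range(max_steps):
        if step % clk1_period == 0:    # CLK1 tick
            request_seen = req_sync.tick(request)
            if pending > 0:
                pending -= 1           # assumption: one access completes per CLK1 tick
            elif request_seen:
                grant = True           # all CLK1 accesses done, so grant the banks
        if grant_sync.tick(grant):     # CLK2 tick on every step
            return step
    raise RuntimeError("handshake did not complete")
```

The model shows the two sources of delay in a switch-over: the synchronizer stages in each direction, and the time spent waiting for outstanding accesses in the first clock domain to finish. Calling `run_handshake` with a larger number of pending accesses returns a later step.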

When the memory controller 106 is configured to enable access of the memory banks 3 and 4 by the first processor 102 (as in FIG. 17), the arbitration multiplexer 316 allows memory access requests from the first clock domain requester 310 to pass to the memory banks 3 and 4. Afterwards, if the first processor 102 accesses the memory banks 3 and 4, the first clock domain requester 310 sends the memory access requests to the memory banks 3 and 4 directly without using the arbitration process and the first processor 102 experiences no latency due to arbitration.

When the memory controller 106 is re-configured to allow the second processor 104 to access memory banks 3 and 4 (as in FIG. 18), the arbitration process described above is used to ensure proper switch-over, and there is a delay while waiting for the arbitration process to be completed. After the arbitration process is completed and the memory controller 106 is re-configured to allow the second processor 104 to access memory banks 3 and 4, the arbitration multiplexer 316 allows memory access requests from the second clock domain requester 312 to pass to the memory banks 3 and 4. Afterwards, if the second processor 104 accesses the memory banks 3 and 4, the second clock domain requester 312 sends the memory access requests to the memory banks 3 and 4 directly without using the arbitration process and the second processor 104 experiences no latency due to arbitration.

Thus, as discussed above, continued accesses of a shared memory bank from a single clock domain experience no arbitration latency. However, there is an arbitration penalty for switching clock domains, including multiple clock domain synchronization delays.

Below are examples of access timing schemes for the L2 memory 112 (and memory banks in the shared memory 290 accessed by the second processor 104) and the L3 memory 110 (and memory banks in the shared memory 290 accessed by the first processor 102).

FIG. 7 is an example L3 memory timing diagram 160. The diagram 160 shows signal levels for the interface timing and internal timing during the write cycle 140 and the read cycle 142 using a clock signal CLK_L3 as reference. The signals related to interface timing include a memory bank select signal 144, an address and control signal 146, write data 148, and read data 150. The signals related to internal timing include an input latched indication signal 152, a write data latched indication signal 154, a write access start signal 156, and a read access start signal 158. In this example, all L3 operations, except write access start, are self-timed from the falling edge of the clock signal CLK_L3.

FIG. 8 is an example L2 memory timing diagram 170. The diagram 170 shows signal levels for the interface timing and internal timing during the write cycle 172 and the read cycle 174 using clock signals CLK_L2 and Early_CLK_L2 as references. The CLK_L2 clock signal is used to latch outputs. The Early_CLK_L2 signal triggers write and read accesses. In this example, all L2 memory operations, except output data latching, are self-timed from the falling edge of the clock signal Early_CLK_L2.

FIG. 9 is an example timing diagram 180 for the case where L2 and L3 memory accesses are interleaved and directed to the same memory bank. The L2 memory uses the clock signal CLK_L2 as reference, and the L3 memory uses the clock signal CLK_L3 as reference. In this example, the clock signals CLK_L2 and CLK_L3 have the same frequency.

FIG. 10 is an example timing diagram 190 for the case where L2 and L3 memory accesses are directed to different memory banks and can occur concurrently. The L2 memory uses the clock signal CLK_L2 as reference, and the L3 memory uses the clock signal CLK_L3 as reference. In this example, the clock signals CLK_L2 and CLK_L3 have the same frequency.

FIGS. 11A and 11B are example timing diagrams 200 and 210 for the case where L2 and L3 memory accesses are interleaved and directed to the same memory bank. The L2 memory uses the clock signal CLK_L2 as reference, the L3 memory uses the clock signal CLK_L3 as reference, and the clock signal CLK_L2 has a frequency that is twice the frequency of the clock signal CLK_L3.

FIG. 12 is an example timing diagram 220 for the case where L2 and L3 memory accesses are directed to different memory banks and can occur concurrently. The L2 memory uses the clock signal CLK_L2 as reference, the L3 memory uses the clock signal CLK_L3 as reference, and the clock signal CLK_L2 has a frequency that is twice the frequency of the clock signal CLK_L3.

FIGS. 13A and 13B are example timing diagrams 230 and 240 for the case where L2 and L3 memory accesses are interleaved and directed to the same memory bank. The L2 memory uses the clock signal CLK_L2 as reference, the L3 memory uses the clock signal CLK_L3 as reference, and the clock signal CLK_L2 has a frequency that is three times the frequency of the clock signal CLK_L3.

FIG. 14 is an example timing diagram 250 for the case where L2 and L3 memory accesses are directed to different memory banks and can occur concurrently. The L2 memory uses the clock signal CLK_L2 as reference, the L3 memory uses the clock signal CLK_L3 as reference, and the clock signal CLK_L2 has a frequency that is three times the frequency of the clock signal CLK_L3.

FIGS. 15A and 15B are example timing diagrams 260 and 270 for the case where L2 and L3 memory accesses are interleaved and directed to the same memory bank. The L2 memory uses the clock signal CLK_L2 as reference, the L3 memory uses the clock signal CLK_L3 as reference, and the clock signal CLK_L2 has a frequency that is four times the frequency of the clock signal CLK_L3.

FIG. 16 is an example timing diagram 280 for the case where L2 and L3 memory accesses are directed to different memory banks and can occur concurrently. The L2 memory uses the clock signal CLK_L2 as reference, the L3 memory uses the clock signal CLK_L3 as reference, and the clock signal CLK_L2 has a frequency that is four times the frequency of the clock signal CLK_L3.

It should be appreciated that various aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the implementations described in the foregoing, and are therefore not limited in their application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings.

Although some examples have been discussed above, other implementations and applications are also within the scope of the following claims. Various aspects of the invention described herein may be implemented in any of numerous ways. For example, the various components described above may be implemented in hardware, firmware, software or any combination thereof. The memory device 108 can be divided into two portions, and the two portions do not necessarily have to be accessed as L2 and L3 memory. The two portions can be treated as two L3 memory modules, or two L2 cache memory modules, etc. The labeling of “L2 memory” and “L3 memory” (or “system memory”) in the description is only for illustration of the examples.

In the example of FIGS. 4 and 5, the boundaries of the L2 and L3 memory align with the boundaries of the memory banks. It is also possible to design the system 100 so that the memory device 108 can be configured according to blocks, which can include more than one memory bank, and do not necessarily align with memory banks. Each memory block can receive a separate clock signal and can have a separate read/write access port. The timing diagrams for the L2 and L3 memory can be different from those shown in FIGS. 7-16.

In the examples of FIGS. 2-5, the multiplexers 128 are placed in the memory controller 106, and the selected clock signals are provided from the memory controller 106 to the memory banks 114. It is also possible to place the multiplexers 128 outside of, but still controlled by, the memory controller 106.
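The role of the multiplexers 128 can be illustrated with a very small functional model. The function name `select_bank_clocks` and the convention that a select value of 0 passes the first clock and 1 passes the second are assumptions made for this sketch; the description does not specify how the select signals are encoded.

```python
def select_bank_clocks(bank_select, clk1="CLK1", clk2="CLK2"):
    """Return the clock passed to each memory bank by its multiplexer.

    bank_select[i] == 0 passes clk1 to bank i; any other value passes clk2
    (a hypothetical encoding chosen for this example).
    """
    return [clk2 if sel else clk1 for sel in bank_select]


# Example: four banks clocked by CLK1 and two banks clocked by CLK2.
clocks = select_bank_clocks([0, 0, 0, 0, 1, 1])
```

Whether the multiplexers sit inside or outside the memory controller, the mapping is the same: the controller drives one select value per bank, and each bank receives exactly one of the two clock signals.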

In the examples of FIGS. 17 and 18, each of the six memory banks can be accessed by the first processor 102 or the second processor 104. In some implementations, the clock domain crossing and bank arbitration unit 302 can have six arbitration multiplexers 316 each associated with a separate shared memory bus 314 to allow concurrent access to the six shared memory banks. In some implementations, only one arbitration multiplexer 316 and one shared memory bus 314 are used. This allows less concurrency of access, but improves access timing and reduces logic complexity.

The memory device 108 can be a dynamic random access memory (DRAM) device, a static random access memory (SRAM) device, or other types of memory. The widths of the buses 130 and 132 can be different from those described above. For example, one of the buses can be 128 bits wide, or wider.

1. An apparatus comprising: a first processor that accesses memory according to a first clock frequency; a second processor that accesses memory according to a second clock frequency; a memory device configurable to selectively operate according to the first clock frequency or the second clock frequency; and a memory controller to enable dynamic configuration of organization of the memory device to allow a first portion of the memory device to be accessed by the first processor according to the first clock frequency and a second portion of the memory device to be accessed by the second processor according to the second clock frequency.
2. The apparatus of claim 1 in which the memory controller enables re-configuration of the organization of the memory device to adjust the sizes of the first and second portions while the first processor is executing an application program.
3. The apparatus of claim 1 in which the memory controller enables re-configuration of the organization of the memory device to adjust the sizes of the first and second portions upon start-up of the memory device and the memory controller.
4. The apparatus of claim 1 in which the memory device comprises a plurality of memory banks and the memory controller allocates the first and second portions of the memory device along boundaries of memory banks.
5. The apparatus of claim 4, comprising multiplexers each associated with a memory bank, each multiplexer selecting from a first clock signal having the first clock frequency and a second clock signal having the second clock frequency and passing the selected clock signal to the corresponding memory bank.
6. The apparatus of claim 1 in which the memory controller enables dynamic configuration of the organization of the memory device to allow a third portion of the memory device to be accessed exclusively by the first processor according to the first clock frequency.
7. The apparatus of claim 6 in which the memory controller enables re-configuration of the memory device to adjust the sizes of the first and second portions, and the memory controller comprises an address decoder to receive memory access requests from the first processor and determine whether the memory access requests are for accessing the first or third portion of the memory device.
8. The apparatus of claim 1 in which the memory controller re-configures the organization of the memory device to re-allocate the sizes of the first and second portions of the memory device using five or less clock cycles according to the slower of the first and second clock frequencies.
9. The apparatus of claim 1 in which the first portion of the memory device is accessed using a first bus width, and the second portion of the memory device is accessed using a second bus width.
10. The apparatus of claim 1 in which the memory controller monitors execution of memory access instructions, and upon receiving a signal to switch from a first memory access clock frequency to a second memory access clock frequency for accessing a segment of the memory device, determines whether a previous memory access instruction using the first memory access clock frequency for the segment of the memory device has been completed before switching to the second memory access clock frequency.
11. The apparatus of claim 1 in which the memory controller comprises an arbitration unit to arbitrate accesses to shared memory banks of the memory by the first and second processors.
12. The apparatus of claim 1 in which the memory controller comprises a first clock domain requester that operates according to the first clock frequency and passes memory access instructions from the first processor to the shared memory banks, and a second clock domain requester that operates according to the second clock frequency and passes memory access instructions from the second processor to the shared memory banks, wherein only one of the first and second clock domain requesters is granted access to any particular shared memory bank at a given time.
13. The apparatus of claim 12 in which the first clock domain requester is granted access to a particular shared memory bank until the second clock domain requester requests access to the particular shared memory bank, upon which the first and second clock domain requesters perform an arbitration process to determine whether the second clock domain requester can be granted access to the particular shared memory bank.
14. The apparatus of claim 13 in which the second clock domain requester sends an arbitration request signal to the first clock domain requester, and the first clock domain requester sends an arbitration grant signal to the second clock domain requester after the first clock domain requester determines that all memory access requests from the first processor to the particular shared memory bank have been completed.
15. The apparatus of claim 14 in which the memory controller comprises one or more synchronization units that synchronize the arbitration request signal and arbitration grant signal across different clock domains.
16. The apparatus of claim 1 in which the memory device is configurable to selectively operate according to three or more clock frequencies, and the memory controller enables dynamic configuration of the memory device to allow a portion of the memory device to be accessed according to any of the clock frequencies in which the memory device is operable.
17. The apparatus of claim 1 in which the memory controller automatically determines an organization of the memory device based on which processor is accessing the memory and the clock frequency of the access.
18. The apparatus of claim 17 in which the memory controller enables dynamic configuration of the memory to allow the entire memory to be accessible by the first processor according to the first clock frequency or the second processor according to the second clock frequency.
19. An apparatus comprising: a first processor that accesses memory according to a first timing scheme; a second processor that accesses memory according to a second timing scheme; a memory device having shared memory banks that can be accessed by either the first or second processors, the shared memory banks configurable to selectively operate according to the first timing scheme or the second timing scheme; and a memory controller to enable dynamic configuration of organization of the shared memory banks to allow a first set of shared memory banks to be accessed by the first processor according to the first timing scheme and a second set of the shared memory banks to be accessed by the second processor according to the second timing scheme.
20. The apparatus of claim 19 in which the first processor accesses the shared memory banks according to a first clock frequency, and the second processor accesses the shared memory banks according to a second clock frequency.
21. The apparatus of claim 20 in which the memory controller comprises a first clock domain requester that operates according to the first clock frequency and passes memory access instructions from the first processor to the shared memory banks, and a second clock domain requester that operates according to the second clock frequency and passes memory access instructions from the second processor to the shared memory banks, wherein only one of the first and second clock domain requesters is granted access to any particular shared memory bank at a given time.
22. The apparatus of claim 21 in which when the first processor is granted access to a particular shared memory bank, and the second processor seeks access to the particular shared memory bank, the second clock domain requester sends an arbitration request signal to the first clock domain requester, and the first clock domain requester sends an arbitration grant signal to the second clock domain requester after the first clock domain requester determines that memory access requests from the first processor to the particular shared memory bank have been completed.
23. The apparatus of claim 22 in which the memory controller comprises one or more synchronization units that synchronize the arbitration request signal and arbitration grant signal across different clock domains.
24. A method comprising: dynamically configuring organization of a memory device to allow a first portion of the memory device to be accessed by a first processor according to a first timing scheme and a second portion of the memory device to be accessed by a second processor according to a second timing scheme; receiving a first memory access instruction from the first processor; accessing the first portion of the memory device responsive to the first memory access instruction according to the first timing scheme; receiving a second memory access instruction from the second processor; and accessing the second portion of the memory device responsive to the second memory access instruction according to the second timing scheme.
25. The method of claim 24 in which accessing the first portion of the memory device according to the first timing scheme comprises accessing the first portion of the memory device according to a first clock frequency, and accessing the second portion of the memory device according to the second timing scheme comprises accessing the second portion of the memory device according to a second clock frequency.
26. The method of claim 25, comprising, for each memory bank in the memory device, selecting one of a first clock signal having the first clock frequency and a second clock signal having the second clock frequency, and passing the selected clock signal to the memory bank.
27. The method of claim 24, comprising re-configuring the organization of the memory device to adjust the sizes of the first and second portions while executing an application program by the first processor.
28. The method of claim 24, comprising re-configuring the organization of the memory device to adjust the sizes of the first and second portions upon start-up of the memory device and the memory controller.
29. The method of claim 24, comprising allocating the first and second portions of the memory device along boundaries of memory banks of the memory device.
30. The method of claim 24, comprising re-configuring the memory device to re-allocate the sizes of the first and second portions of the memory device using five or less clock cycles according to the slower of the first and second clock frequencies.
31. The method of claim 24, comprising accessing the first portion of the memory device using a first bus width, and accessing the second portion of the memory device using a second bus width.
32. The method of claim 24 in which the memory controller monitors execution of memory access instructions, and upon receiving a signal to switch from a first memory access timing scheme to a second memory access timing scheme, determines whether memory access instructions associated with the first memory access timing scheme have been completed before switching to the second memory access timing scheme.
33. A method comprising: dynamically configuring organization of a memory device having shared memory banks that are shared between a first processor and a second processor to allow a first set of the shared memory banks to be accessed by the first processor according to a first clock frequency and a second set of the shared memory banks to be accessed by the second processor according to a second clock frequency; and arbitrating requests for accessing the shared memory banks by the first and second processors, the requests from the first processor being synchronized according to the first clock frequency and the requests from the second processor being synchronized according to the second clock frequency.
34. The method of claim 33, comprising granting access to a particular shared memory bank to the first processor until the second processor requests access to the particular shared memory bank, and performing an arbitration handshake to determine whether the second processor is granted access to the particular shared memory bank.
35. The method of claim 34 in which performing the arbitration handshake comprises sending an arbitration request signal from a second clock domain requester to a first clock domain requester, and sending an arbitration grant signal from the first clock domain requester to the second clock domain requester after the first clock domain requester determines that memory access requests from the first processor to the particular shared memory bank have been completed.
36. The method of claim 35, comprising granting access to the particular shared memory bank to the second processor until the first processor requests access to the particular shared memory bank, sending an arbitration request signal from the first clock domain requester to the second clock domain requester, and sending an arbitration grant signal from the second clock domain requester to the first clock domain requester after the second clock domain requester determines that memory access requests from the second processor to the particular shared memory bank have been completed.
37. The method of claim 35, comprising synchronizing the arbitration request signal and the arbitration grant signal across different clock domains.
38. An apparatus comprising: a memory device having a plurality of portions each configurable to operate according to multiple timing schemes; and means for enabling dynamic configuration of organization of the memory device to allow each portion of the memory device to be dynamically configured to be accessed according to a first timing scheme or a second timing scheme while executing an application program.